NOTICE 


THIS DOCUMENT HAS BEEN REPRODUCED FROM 
MICROFICHE. ALTHOUGH IT IS RECOGNIZED THAT 
CERTAIN PORTIONS ARE ILLEGIBLE, IT IS BEING RELEASED 
IN THE INTEREST OF MAKING AVAILABLE AS MUCH 
INFORMATION AS POSSIBLE 



AgRISTARS

E81-10182
SR-L0-00478
JSC-16378

"Made available under NASA sponsorship in the interest of early and wide
dissemination of Earth Resources Survey Program information and without
liability for any use made thereof."


Supporting Research 


A Joint Program for 
Agriculture and 
Resources Inventory 
Surveys Through 
Aerospace 
Remote Sensing 

November 1980 


THE MULTICATEGORY CASE OF THE SEQUENTIAL 
BAYESIAN PIXEL SELECTION AND 
ESTIMATION PROCEDURE 

M. D. Pore and T. B. Dennis 




Lockheed Engineering and Management Services Company, Inc. 
1830 NASA Road 1, Houston, Texas 77058 





NASA 








Lyndon B. Johnson Space Center 

Houston, Texas 77058



1. Report No.: JSC-16378; SR-L0-00478
2. Government Accession No.:
3. Recipient's Catalog No.:
4. Title and Subtitle: The Multicategory Case of the Sequential Bayesian Pixel Selection and Estimation Procedure
5. Report Date: November 1980
6. Performing Organization Code:
7. Author(s): M. D. Pore and T. B. Dennis, Lockheed Engineering and Management Services Company, Inc.
8. Performing Organization Report No.: LEMSCO-14807
9. Performing Organization Name and Address: Lockheed Engineering and Management Services Company, Inc., 1830 NASA Road 1, Houston, Texas 77058
10. Work Unit No.:
11. Contract or Grant No.: NAS 9-15800
12. Sponsoring Agency Name and Address: National Aeronautics and Space Administration, Lyndon B. Johnson Space Center, Houston, Texas 77058. Technical Monitor: J. D. Erickson
13. Type of Report and Period Covered: Technical Report
14. Sponsoring Agency Code:
15. Supplementary Notes:
16. Abstract:

A Bayesian technique for stratified proportion estimation and a sampling procedure based
on minimizing the mean squared error of this estimator have been developed and tested
on Landsat multispectral scanner data using the beta density function to model the prior
distribution in the two-class case. In this paper, an extension of this procedure to
the k-class case is considered. A generalization of the beta function is shown to be a
density function for the general case, which allows the procedure to be extended.

17. Key Words (Suggested by Author(s)): Bayesian techniques; Stratified proportion estimation; Sequential allocation
18. Distribution Statement:
19. Security Classif. (of this report): Unclassified
20. Security Classif. (of this page): Unclassified
21. No. of Pages: 21
22. Price*:

*For sale by the National Technical Information Service, Springfield, Virginia 22151
SR-L0-00478 

JSC-16378 


THE MULTICATEGORY CASE OF THE SEQUENTIAL BAYESIAN PIXEL 
SELECTION AND ESTIMATION PROCEDURE 

Job Order 73-306 

This report describes Classification activities 
of the Supporting Research project of the AgRISTARS program. 

PREPARED BY 

M. D. Pore and T. B. Dennis 


APPROVED BY

T. C. Minter, Supervisor
Techniques Development Section

J. E. Wainwright, Manager
Development and Evaluation Department



LOCKHEED ENGINEERING AND MANAGEMENT SERVICES COMPANY, INC. 
Under Contract NAS 9-15800 
For 

Earth Observations Division 

Space and Life Sciences Directorate 

NATIONAL AERONAUTICS AND SPACE ADMINISTRATION 
LYNDON B. JOHNSON SPACE CENTER 
HOUSTON, TEXAS 

November 1980 


LEMSCO-14807


CONTENTS 


Section

1. INTRODUCTION

2. THE THREE-CATEGORY CASE

3. THE K-CATEGORY CASE

4. REMARKS

5. SUMMARY

6. REFERENCES


1. INTRODUCTION 


A Bayesian technique for stratified proportion estimation and a sequential
sampling procedure based on minimizing the mean squared error (MSE) of the
posterior Bayesian estimator were developed by Pore (ref. 1) and tested by
Lennington and Johnson (ref. 2) for the two-category case. The most favorable
results were obtained when the prior distribution was modeled as a beta
density function. These favorable results stemmed from a combination of the
mathematical ease in developing the estimator and theoretical MSE, the ability
to model the empirical prior distribution fairly closely with the beta, and the
high accuracy in the data analysis. Virtually no bias and an MSE less than
that of the proportional allocation case were reported. These results were
obtained from analyses using Land Satellite (Landsat) multispectral scanner
(MSS) data in which stratification was achieved by clustering picture elements
(pixels) in a 9- by 11-kilometer area referred to as a segment. The two
categories used were predominantly small-grains agricultural crops and
nonsmall grains.

In section 2, the Bayesian development is presented for the three-category 
case, and in section 3, it is generalized to the k-category case. The three- 
category case might be used where, for example, barley is to be estimated 
within the small-grains category. A procedure of directly estimating barley, 
other small grains, and nonsmall grains might be tested if labeling practices 
allowed the direct labeling of barley and other small grains. 

The k-category case in section 3 is presented for completeness and to document 
the results for future crop estimation possibilities. 

The environment of these developments is as follows: 

a. The segment (population) has been clustered (stratified) into several
subgroups,

b. Pixels (samples) can be selected randomly within each cluster, and

c. The clustering of segments (with a given algorithm) has been performed in
the past and compared to the actual labels of the pixels. Furthermore,
the clustering algorithm performs somewhat uniformly across segments; that
is, the rates at which different purities of clusters are generated are
approximately the same from segment to segment.

Sections 2 and 3 present the development of estimators for the proportion 
estimation of categories within a cluster. The estimator is then applied 
separately to each cluster to obtain segment-level proportion estimates. The 
MSE is obtained in the same manner. Remarks in section 4 give additional 
information about obtaining segment-level estimates. 

Within a cluster, the true proportion of category i is denoted $\theta_i$; the
estimated proportion, $\hat{\theta}_i$; and $x_i$ denotes the number of pixels
labeled as category i.


2. THE THREE-CATEGORY CASE


In the three-category case, $\theta_1 + \theta_2 + \theta_3 = 1$, and the conditional distribution
of $x_1$, $x_2$, and $x_3$ is

\[
f(x_1, x_2, x_3 \mid \theta_1, \theta_2, \theta_3)
  = \frac{(x_1 + x_2 + x_3)!}{x_1!\,x_2!\,x_3!}\,\theta_1^{x_1}\theta_2^{x_2}\theta_3^{x_3}
  = M_0\,\theta_1^{x_1}\theta_2^{x_2}(1 - \theta_1 - \theta_2)^{x_3}
\]

where $\theta_i \in (0, 1)$ and $x_i \in \{0, 1, \ldots\}$. This is a multinomial model: a
generalization of the binomial model used in the two-category case.


We assume that, from previous experience with the clustering algorithm, the
distribution of the array $(\theta_1, \theta_2, \theta_3)$ of cluster proportions can be modeled as

\[
g(\theta_1, \theta_2, \theta_3)
  = K_0\,\theta_1^{a_1}\theta_2^{a_2}\theta_3^{a_3}
  = K_0\,\theta_1^{a_1}\theta_2^{a_2}(1 - \theta_1 - \theta_2)^{a_3}
\]

where

\[
a_1, a_2, a_3 > -1, \qquad \theta_1 \in [0, 1], \qquad \theta_2 \in [0, 1 - \theta_1]
\]

and

\[
K_0 = \frac{\Gamma(a_1 + a_2 + a_3 + 3)}{\Gamma(a_1 + 1)\,\Gamma(a_2 + 1)\,\Gamma(a_3 + 1)}
\]

The proofs that f and g are indeed probability density functions (pdf's) are
given in section 3.
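Now using the notation $\theta = (\theta_1, \theta_2, \theta_3)$ and $X = (x_1, x_2, x_3)$, the posterior distribution is

\[
h(\theta \mid X) = \frac{g(\theta)\,f(X \mid \theta)}{p(X)}
\]

where

\[
p(X) = \int_0^1 \int_0^{1-\theta_2} g(\theta)\,f(X \mid \theta)\,d\theta_1\,d\theta_2
     = M_0 K_0\,\frac{\Gamma(x_1 + a_1 + 1)\,\Gamma(x_2 + a_2 + 1)\,\Gamma(x_3 + a_3 + 1)}
                     {\Gamma(x_1 + x_2 + x_3 + a_1 + a_2 + a_3 + 3)}
\]

Now

\[
\hat{\theta}_1 = E(\theta_1 \mid X)
  = \int_0^1 \int_0^{1-\theta_2} \theta_1\,h(\theta \mid X)\,d\theta_1\,d\theta_2
  = \frac{x_1 + a_1 + 1}{x_1 + x_2 + x_3 + a_1 + a_2 + a_3 + 3}
\]

\[
\hat{\theta}_2 = E(\theta_2 \mid X)
  = \int_0^1 \int_0^{1-\theta_2} \theta_2\,h(\theta \mid X)\,d\theta_1\,d\theta_2
  = \frac{x_2 + a_2 + 1}{x_1 + x_2 + x_3 + a_1 + a_2 + a_3 + 3}
\]

\[
\hat{\theta}_3 = E(1 - \theta_1 - \theta_2 \mid X)
  = \int_0^1 \int_0^{1-\theta_2} (1 - \theta_1 - \theta_2)\,h(\theta \mid X)\,d\theta_1\,d\theta_2
  = \frac{x_3 + a_3 + 1}{x_1 + x_2 + x_3 + a_1 + a_2 + a_3 + 3}
\]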
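As a small numerical illustration of the three estimators, the following Python sketch evaluates them for one cluster. The function name `posterior_means`, the label counts, and the prior exponents are hypothetical and chosen only for illustration; integer exponents are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def posterior_means(x, a):
    """Posterior means (x_i + a_i + 1) / (sum(x) + sum(a) + 3) for three categories."""
    if len(x) != 3 or len(a) != 3:
        raise ValueError("expected three counts and three prior exponents")
    denom = sum(x) + sum(a) + 3
    return [Fraction(xi + ai + 1, denom) for xi, ai in zip(x, a)]

# Hypothetical cluster: 6 pixels labeled barley, 3 other small grains,
# 1 nonsmall grains, with assumed prior exponents a = (1, 0, 2).
theta_hat = posterior_means((6, 3, 1), (1, 0, 2))
print(theta_hat)       # three exact fractions
print(sum(theta_hat))  # the estimates always sum to 1
```

Because the three numerators add up to the common denominator, the estimated proportions sum to one for any counts and exponents.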

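Assuming $N_0 = x_1 + x_2 + x_3$ is fixed, expressions are easily derived for the
bias, variance, and mean squared error (MSE). With $A_0 = a_1 + a_2 + a_3$,

\[
\hat{\theta}_i = \frac{x_i + a_i + 1}{N_0 + A_0 + 3}
\]

\[
\mathrm{bias}(\hat{\theta}_i) = E(\hat{\theta}_i - \theta_i)
  = \frac{N_0\theta_i + a_i + 1}{N_0 + A_0 + 3} - \theta_i
  = \frac{a_i + 1 - \theta_i(A_0 + 3)}{N_0 + A_0 + 3}
\]

\[
\mathrm{Var}(\hat{\theta}_i) = E(\hat{\theta}_i - E\hat{\theta}_i)^2
  = E\left(\frac{x_i - N_0\theta_i}{N_0 + A_0 + 3}\right)^2
  = \frac{N_0\,\theta_i(1 - \theta_i)}{(N_0 + A_0 + 3)^2}
\]

\[
\mathrm{MSE}(\hat{\theta}_i) = \mathrm{Var}(\hat{\theta}_i) + [\mathrm{bias}(\hat{\theta}_i)]^2
  = \frac{N_0\,\theta_i(1 - \theta_i) + [a_i + 1 - \theta_i(A_0 + 3)]^2}{(N_0 + A_0 + 3)^2}
\]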
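The closed-form bias and variance can be checked against a direct expectation, using the fact that each count $x_i$ of a multinomial sample is marginally binomial with parameters $N_0$ and $\theta_i$. The Python sketch below is only an illustrative check; the function names and the numerical values are assumptions, not part of the original procedure.

```python
from math import comb

def closed_form(theta, a_i, A0, N0, k=3):
    """Bias, variance and MSE of (x_i + a_i + 1) / (N0 + A0 + k) for fixed N0."""
    d = N0 + A0 + k
    bias = (a_i + 1 - theta * (A0 + k)) / d
    var = N0 * theta * (1 - theta) / d**2
    return bias, var, var + bias**2

def by_enumeration(theta, a_i, A0, N0, k=3):
    """Same quantities, summed exactly over the Binomial(N0, theta) law of x_i."""
    d = N0 + A0 + k
    pmf = [comb(N0, x) * theta**x * (1 - theta)**(N0 - x) for x in range(N0 + 1)]
    est = [(x + a_i + 1) / d for x in range(N0 + 1)]
    mean = sum(p * e for p, e in zip(pmf, est))
    var = sum(p * (e - mean)**2 for p, e in zip(pmf, est))
    mse = sum(p * (e - theta)**2 for p, e in zip(pmf, est))
    return mean - theta, var, mse

# Hypothetical values: theta_i = 0.3, a_i = 1, A0 = 3, N0 = 12.
print(closed_form(0.3, 1, 3, 12))
print(by_enumeration(0.3, 1, 3, 12))
```

The two printed triples should agree to rounding error, which also confirms that the MSE equals the variance plus the squared bias.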
3. THE K-CATEGORY CASE


The k-category case is merely an extension of the three-category case. Proofs
have been omitted from section 2 since they are special cases of those
presented in this section.


We begin by assuming that the prior distribution, the distribution of the
array $\theta = (\theta_1, \theta_2, \ldots, \theta_k)$, can be modeled as a generalized beta pdf.


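Theorem 1: The function

\[
g(\theta) = g(\theta_1, \ldots, \theta_k)
  = K \prod_{i=1}^{k} \theta_i^{a_i}
  = K \left(1 - \sum_{j=1}^{k-1} \theta_j\right)^{a_k} \prod_{i=1}^{k-1} \theta_i^{a_i}
\]

where

\[
\sum_{i=1}^{k} \theta_i = 1, \qquad \theta_i > 0 \ \text{for } i \in \{1, \ldots, k\}
\]

and

\[
K = \frac{\Gamma\left(\sum_{i=1}^{k} (a_i + 1)\right)}{\prod_{i=1}^{k} \Gamma(a_i + 1)}
\]

is a probability density function for each set of $\{a_i\}$ such that
$a_i > -1$, $i \in \{1, \ldots, k\}$.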
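Before turning to the proof, the statement of Theorem 1 can be sanity-checked numerically for k = 3. The Python sketch below computes K from the gamma function and integrates g over the triangle $\theta_1 + \theta_2 < 1$ with a simple midpoint rule. The function names, the grid size, and the exponents (1, 2, 3) are assumptions made for illustration; the midpoint rule is only a rough numerical check and not part of the original development.

```python
from math import gamma

def norm_const(a):
    """K = Gamma(sum(a_i + 1)) / prod(Gamma(a_i + 1)) for exponents a_i > -1."""
    k_const = gamma(sum(ai + 1 for ai in a))
    for ai in a:
        k_const /= gamma(ai + 1)
    return k_const

def integrate_g(a, n=200):
    """Midpoint-rule integral of g over {theta1, theta2 > 0, theta1 + theta2 < 1}."""
    a1, a2, a3 = a
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t1 = (i + 0.5) * h
        for j in range(n):
            t2 = (j + 0.5) * h
            t3 = 1.0 - t1 - t2
            if t3 > 0.0:  # keep only grid points inside the triangle
                total += t1**a1 * t2**a2 * t3**a3
    return norm_const(a) * total * h * h

print(norm_const((1, 2, 3)))   # Gamma(9) / (Gamma(2) Gamma(3) Gamma(4))
print(integrate_g((1, 2, 3)))  # should be close to 1
```

In more recent terminology this density is the Dirichlet distribution with parameters $a_i + 1$, so K is the usual Dirichlet normalizing constant.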


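Proof: The function is obviously nonnegative and continuous for $0 < \theta_i < 1$
for each i. Hence, it remains only to show that it integrates to 1. Notice
that if k = 2, g reduces to the well-known beta pdf; i.e., the theorem is true
for k = 2 since

\[
\int_0^1 t^{a_1}(1 - t)^{a_2}\,dt = \frac{\Gamma(a_1 + 1)\,\Gamma(a_2 + 1)}{\Gamma(a_1 + a_2 + 2)}
\]

for any choices of $a_1$ and $a_2 > -1$.

From here, we proceed by induction on k. We assume that the theorem is true
for k = n; i.e.,

\[
\int_0^1 \int_0^{1-\theta_1} \cdots \int_0^{1-\sum_{j=1}^{n-2}\theta_j}
  \left(1 - \sum_{j=1}^{n-1} \theta_j\right)^{a_n} \prod_{i=1}^{n-1} \theta_i^{a_i}
  \,d\theta_{n-1} \cdots d\theta_2\,d\theta_1
  = \frac{\prod_{i=1}^{n} \Gamma(a_i + 1)}{\Gamma\left(\sum_{i=1}^{n} (a_i + 1)\right)}
\]

for all choices of $a_1, a_2, \ldots, a_n > -1$.

We then use the substitution

\[
t = \frac{\theta_n}{1 - \sum_{j=1}^{n-1} \theta_j}
\]

to evaluate the integral in question for the values $a_1, \ldots, a_{n+1}$ in the case
k = n + 1. This integral is

\[
\int_0^1 \int_0^{1-\theta_1} \cdots \int_0^{1-\sum_{j=1}^{n-1}\theta_j}
  \left(1 - \sum_{j=1}^{n} \theta_j\right)^{a_{n+1}} \prod_{i=1}^{n} \theta_i^{a_i}
  \,d\theta_n \cdots d\theta_2\,d\theta_1
\qquad (3\text{-}1)
\]

By the substitution of the values $a_n$, $a_{n+1}$ in the known case k = 2, and the
values $a_1, a_2, \ldots, a_{n-1}$, and $(a_n + a_{n+1} + 1)$ in the assumed induction
hypothesis, this integral reduces to

\[
\frac{\Gamma(a_n + 1)\,\Gamma(a_{n+1} + 1)}{\Gamma(a_n + a_{n+1} + 2)}
\cdot
\frac{\Gamma(a_n + a_{n+1} + 2) \prod_{i=1}^{n-1} \Gamma(a_i + 1)}{\Gamma\left(\sum_{i=1}^{n+1} (a_i + 1)\right)}
= \frac{\prod_{i=1}^{n+1} \Gamma(a_i + 1)}{\Gamma\left(\sum_{i=1}^{n+1} (a_i + 1)\right)}
\]

which is 1/K for k = n + 1, so g integrates to 1.

The conditional distribution of the observed frequencies $X = (x_1, x_2, \ldots, x_k)$,
given the true proportions $\theta = (\theta_1, \ldots, \theta_k)$, is the well-known multinomial
distribution:

\[
f(X \mid \theta) = M \prod_{i=1}^{k} \theta_i^{x_i}
  = M \left(1 - \sum_{j=1}^{k-1} \theta_j\right)^{x_k} \prod_{i=1}^{k-1} \theta_i^{x_i}
\]

where $0 < \theta_i < 1$, $\sum_{i=1}^{k} \theta_i = 1$, $x_i \in \{0, 1, 2, \ldots\}$, and

\[
M = \frac{\Gamma\left(\sum_{i=1}^{k} x_i + 1\right)}{\prod_{i=1}^{k} \Gamma(x_i + 1)}
\]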

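Now the posterior distribution of $\theta$ is

\[
h(\theta \mid X) = \frac{g(\theta)\,f(X \mid \theta)}{p(X)}
\]

where

\[
p(X) = K \cdot M\,\frac{\prod_{i=1}^{k} \Gamma(x_i + a_i + 1)}{\Gamma\left(\sum_{i=1}^{k} (x_i + a_i + 1)\right)}
\]


Theorem 2: The marginal distribution of X is p, given above, when the prior
distribution of $\theta$ is the generalized beta pdf given by g, and the conditional
distribution of X is the multinomial f.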


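Proof: The joint distribution of X and $\theta$ can be expressed in terms of g and f
as

\[
t(\theta, X) = g(\theta) \cdot f(X \mid \theta)
\]

and

\[
p(X) = \int t(\theta, X)\,d\theta
     = \int g(\theta)\,f(X \mid \theta)\,d\theta
     = \int_0^1 \int_0^{1-\theta_1} \cdots \int_0^{1-\sum_{j=1}^{k-2}\theta_j}
       g(\theta)\,f(X \mid \theta)\,d\theta_{k-1} \cdots d\theta_2\,d\theta_1
\]

where

\[
g(\theta)\,f(X \mid \theta) = K \cdot M \left(1 - \sum_{j=1}^{k-1} \theta_j\right)^{x_k + a_k} \prod_{i=1}^{k-1} \theta_i^{x_i + a_i}
\]

From the induction hypothesis proven in Theorem 1, it is seen that $g(\theta)\,f(X \mid \theta)$
integrates to

\[
p(X) = K \cdot M\,\frac{\prod_{i=1}^{k} \Gamma(x_i + a_i + 1)}{\Gamma\left(\sum_{i=1}^{k} (x_i + a_i + 1)\right)}
\]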

Now, using the same integration techniques, we derive the estimators.

Theorem 3: For f and g, as defined above, and using $N = \sum_{i=1}^{k} x_i$ and $A = \sum_{i=1}^{k} a_i$,

\[
\hat{\theta}_p = E(\theta_p \mid X) = \int \theta_p\,h(\theta \mid X)\,d\theta
  = \frac{x_p + a_p + 1}{N + A + k}
  = \frac{x_p + a_p + 1}{\sum_{i=1}^{k} (x_i + a_i + 1)}
\]

for each $p \in \{1, \ldots, k\}$.
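Proof: It can be seen that

\[
\theta_p\,h(\theta \mid X)
  = \frac{\Gamma\left(\sum_{i=1}^{k} (x_i + a_i + 1)\right)}{\prod_{i=1}^{k} \Gamma(x_i + a_i + 1)}
    \;\theta_p \left(1 - \sum_{j=1}^{k-1} \theta_j\right)^{x_k + a_k} \prod_{i=1}^{k-1} \theta_i^{x_i + a_i}
\]

Apart from the leading constant, this is the integrand of Theorem 1 with the
exponent of $\theta_p$ raised from $x_p + a_p$ to $x_p + a_p + 1$. Applying Theorem 1 to
that integrand gives

\[
\int \theta_p\,h(\theta \mid X)\,d\theta
  = \frac{\Gamma(N + A + k)}{\prod_{i=1}^{k} \Gamma(x_i + a_i + 1)}
    \cdot
    \frac{\Gamma(x_p + a_p + 2) \prod_{i \ne p} \Gamma(x_i + a_i + 1)}{\Gamma(N + A + k + 1)}
  = \frac{x_p + a_p + 1}{N + A + k}
\]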
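The estimator of Theorem 3 can be checked numerically as a ratio of two normalizing integrals of the kind evaluated in Theorem 1. The Python sketch below does this with log-gamma functions for an arbitrary k; the function names and the example counts and exponents are hypothetical and used only for illustration.

```python
from math import exp, lgamma

def posterior_mean_closed(x, a, p):
    """(x_p + a_p + 1) / (N + A + k), the closed form of Theorem 3."""
    k = len(x)
    return (x[p] + a[p] + 1) / (sum(x) + sum(a) + k)

def posterior_mean_gamma(x, a, p):
    """Same expectation computed as I(c + e_p) / I(c).

    Here c_i = x_i + a_i and
    I(c) = prod(Gamma(c_i + 1)) / Gamma(sum(c_i + 1)),
    the integral of prod(theta_i ** c_i) over the simplex.
    Multiplying by theta_p raises the p-th exponent by one.
    """
    def log_integral(c):
        return sum(lgamma(ci + 1) for ci in c) - lgamma(sum(ci + 1 for ci in c))

    c = [xi + ai for xi, ai in zip(x, a)]
    c_up = list(c)
    c_up[p] += 1
    return exp(log_integral(c_up) - log_integral(c))

x = (7, 2, 0, 5)             # hypothetical label counts, k = 4
a = (0.5, -0.25, 1.0, 2.0)   # hypothetical prior exponents, each > -1
for p in range(len(x)):
    print(p, posterior_mean_closed(x, a, p), posterior_mean_gamma(x, a, p))
```

Both columns should agree to rounding error, and the k estimates sum to one.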
4. REMARKS


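The cluster-specific results presented in sections 2 and 3 can be assimilated
into segment-level statistics by the following equations:

$s$ = the number of clusters or strata

$M_q$ = the number of pixels (samples) in cluster (stratum) q

ToT = the total number of pixels in the segment

\[
\mathrm{ToT} = \sum_{q=1}^{s} M_q
\]

$\theta_{i,q}$ = the true proportion of category i in stratum q

$P_i$ = the proportion of pixels in category i in the segment

\[
P_i = \sum_{q=1}^{s} \frac{M_q}{\mathrm{ToT}}\,\theta_{i,q}
\]

$\hat{\theta}_{i,q}$ = the estimated proportion of category i in stratum q

$\hat{P}_i$ = the estimated proportion of category i in the segment

\[
\hat{P}_i = \sum_{q=1}^{s} \frac{M_q}{\mathrm{ToT}}\,\hat{\theta}_{i,q}
\]

\[
\mathrm{bias}(\hat{P}_i) = \sum_{q=1}^{s} \frac{M_q}{\mathrm{ToT}}\,\mathrm{bias}(\hat{\theta}_{i,q})
\]

\[
\mathrm{Var}(\hat{P}_i) = \sum_{q=1}^{s} \left(\frac{M_q}{\mathrm{ToT}}\right)^2 \mathrm{Var}(\hat{\theta}_{i,q})
\]

\[
\begin{aligned}
\mathrm{MSE}(\hat{P}_i)
 &= \mathrm{Var}(\hat{P}_i) + [\mathrm{bias}(\hat{P}_i)]^2 \\
 &= \sum_{q=1}^{s} \left(\frac{M_q}{\mathrm{ToT}}\right)^2 \mathrm{Var}(\hat{\theta}_{i,q})
    + \left[\sum_{q=1}^{s} \frac{M_q}{\mathrm{ToT}}\,\mathrm{bias}(\hat{\theta}_{i,q})\right]^2 \\
 &= \sum_{q=1}^{s} \left(\frac{M_q}{\mathrm{ToT}}\right)^2 \mathrm{Var}(\hat{\theta}_{i,q})
    + \sum_{q=1}^{s} \left(\frac{M_q}{\mathrm{ToT}}\right)^2 [\mathrm{bias}(\hat{\theta}_{i,q})]^2
    + \sum_{q=1}^{s} \sum_{j=q+1}^{s} 2\,\frac{M_q M_j}{\mathrm{ToT}^2}\,\mathrm{bias}(\hat{\theta}_{i,q})\,\mathrm{bias}(\hat{\theta}_{i,j}) \\
 &= \sum_{q=1}^{s} \left(\frac{M_q}{\mathrm{ToT}}\right)^2 \mathrm{MSE}(\hat{\theta}_{i,q})
    + \sum_{q=1}^{s} \sum_{j=q+1}^{s} 2\,\frac{M_q M_j}{\mathrm{ToT}^2}\,\mathrm{bias}(\hat{\theta}_{i,q})\,\mathrm{bias}(\hat{\theta}_{i,j})
\end{aligned}
\]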
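The segment-level equations translate directly into a few weighted sums. The Python sketch below aggregates cluster-level estimates, biases, and variances for one category and confirms that the expanded form with cross terms equals the variance plus the squared bias. The function names and all numerical inputs are hypothetical; the variance formula assumes, as the equations above do, that the clusters are sampled independently.

```python
def segment_stats(sizes, theta_hat, bias, var):
    """Aggregate cluster-level results for one category to the segment level.

    sizes[q] is M_q; theta_hat[q], bias[q], var[q] refer to cluster q.
    Returns (P_hat, bias(P_hat), Var(P_hat), MSE(P_hat)).
    """
    total = sum(sizes)
    w = [m / total for m in sizes]
    p_hat = sum(wq * t for wq, t in zip(w, theta_hat))
    seg_bias = sum(wq * b for wq, b in zip(w, bias))
    seg_var = sum(wq**2 * v for wq, v in zip(w, var))
    return p_hat, seg_bias, seg_var, seg_var + seg_bias**2

def segment_mse_expanded(sizes, bias, var):
    """MSE(P_hat) written as weighted cluster MSEs plus the bias cross terms."""
    total = sum(sizes)
    w = [m / total for m in sizes]
    s = len(sizes)
    mse = sum(w[q]**2 * (var[q] + bias[q]**2) for q in range(s))
    for q in range(s):
        for j in range(q + 1, s):
            mse += 2 * w[q] * w[j] * bias[q] * bias[j]
    return mse

# Hypothetical segment with three clusters.
sizes = [500, 300, 200]
theta_hat = [0.80, 0.40, 0.10]
bias = [0.01, -0.02, 0.03]
var = [0.004, 0.006, 0.002]
print(segment_stats(sizes, theta_hat, bias, var))
print(segment_mse_expanded(sizes, bias, var))
```

The cross terms vanish only when at most one cluster has a biased estimate, which is why the weighted cluster MSEs alone do not sum to the segment MSE.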

One application of the theory developed in this report is to randomly select a
predesignated number of pixels from a segment, note the pixel labels and
breakdown by clusters, and implement the Bayesian approach (above) to
calculate $\hat{\theta}_{i,q}$ (i = 1, ..., k), $\hat{P}_i$, and $\mathrm{MSE}(\hat{P}_i)$. One problem with this approach is
that each cluster may not contain two samples; thus, $\mathrm{MSE}(\hat{\theta}_{i,q})$ cannot be
estimated, and the MSE evaluation of the estimator, $\hat{P}_i$, will not exist in this
case. Another problem is that the samples may be inefficiently allocated to
obtain a small $\mathrm{MSE}(\hat{P}_i)$. In an attempt to resolve these problems, the
alternate sampling strategy of sampling in proportion to cluster size can be used.
Again, however, since the $\mathrm{MSE}(\hat{\theta}_{i,q})$ is a function of cluster size, number of
samples, and the proportion $\theta_{i,q}$, the optimal sampling strategy will depend on
cluster purity (as well as size). Sampling in proportion to cluster size
cannot be optimal. The following approach is a first attempt at addressing
the problem of stratified sampling within a segment.

In the two-category case, two samples were selected from each cluster (to
ensure an estimate of the variance). Then, additional samples were selected
sequentially so that, at each sampling, the sample was selected from the
cluster that was expected to give the greatest reduction in the weighted
cluster MSE for the one proportion estimate. The weighting is the square of
the cluster size as a proportion of the segment. Therefore, the expected
change for each cluster q is as follows, where n is the number of pixels
sampled so far from the cluster (N in section 3) and x is the number of those
labeled as category i.

\[
\hat{\theta}_{i,q} = \hat{\theta}_{i,q}(n, x) = \frac{x + a_i + 1}{n + A + k}; \qquad k = 2
\]

\[
\mathrm{MSE}^*[\hat{\theta}_{i,q}(n, x)]
  = \left(\frac{M_q}{\mathrm{ToT}}\right)^2
    \frac{n\,\hat{\theta}_{i,q}(n, x)\,[1 - \hat{\theta}_{i,q}(n, x)]
          + [a_i + 1 - \hat{\theta}_{i,q}(n, x) \cdot (A + k)]^2}
         {(n + A + k)^2}
\]

\[
\Delta\mathrm{MSE}^* = \mathrm{MSE}^*[\hat{\theta}(n, x)]
  - [1 - \hat{\theta}(n, x)] \cdot \mathrm{MSE}^*[\hat{\theta}(n + 1, x)]
  - \hat{\theta}(n, x) \cdot \mathrm{MSE}^*[\hat{\theta}(n + 1, x + 1)]
\]

Notice that $\Delta\mathrm{MSE}^*$ is a function of the crop being estimated, though this is
hidden since there are only two categories. In the k-category case, this
dependence can be averaged out for each cluster q by using

\[
\Delta = \sum_{i=1}^{k} \Delta\mathrm{MSE}^*(\hat{\theta}_{i,q})
\]

For k = 2, the two terms $\Delta\mathrm{MSE}^*(\hat{\theta}_{i,q})$ are equal, so $\Delta$ ranks the clusters
exactly as $\Delta\mathrm{MSE}^*(\hat{\theta}_{i,q})$ does for either i, and this problem does not exist.

Also, although $\Delta\mathrm{MSE}^*$ is based on the weighted cluster MSE, it does not exactly
represent the cluster contribution to the segment MSE, $\mathrm{MSE}(\hat{P}_i)$. It would be
preferable to calculate a $\Delta\mathrm{MSE}(\hat{P}_i)$ for each cluster and sampling, but earlier
experiments used $\Delta\mathrm{MSE}^*(\hat{\theta}_{i,q})$ as a computational expedience and an
approximation to $\Delta\mathrm{MSE}(\hat{P}_i)$.

The exact relationship of the two is given in the last $\mathrm{MSE}(\hat{P}_i)$ equation given
above.

The $\Delta\mathrm{MSE}$ criterion, either $\Delta\mathrm{MSE}^*(\hat{\theta}_{i,q})$ or $\Delta\mathrm{MSE}(\hat{P}_i)$, would appear to be the
optimum approach in extending to multicategory (k > 2) proportion estimation
also. The unresolved issue is the determination of which categories to
include and by what weighting. That is, writing $\Delta\mathrm{MSE}(\hat{P}_i)_q$ for the change in
$\mathrm{MSE}(\hat{P}_i)$ expected from sampling cluster q,

\[
\Delta\mathrm{MSE}(\hat{P})_q = \sum_{i=1}^{k} \alpha_i\,\Delta\mathrm{MSE}(\hat{P}_i)_q,
\qquad \alpha_i \ge 0, \quad \sum_{i=1}^{k} \alpha_i = 1
\]

or

\[
\Delta\mathrm{MSE}^*(\hat{\theta})_q = \sum_{i=1}^{k} \alpha_i\,\Delta\mathrm{MSE}^*(\hat{\theta}_{i,q})
\]

The weightings $\{\alpha_i\}$ will determine the relative importance of the respective
crops, or vice versa. Another possibility would be to select from the cluster
q with the largest $\Delta\mathrm{MSE}^*(\hat{\theta}_{i,q})$, i = 1, 2, ..., k. The particular criterion
selected should be tailored to each specific application and determined
through empirical studies.
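The two candidate criteria for k > 2 can be compared with a short Python sketch. It assumes, as one possible reading of the two-category formula, that $\Delta\mathrm{MSE}^*$ for category i is computed by treating category i against all others, so that the next pixel raises $x_i$ with estimated probability $\hat{\theta}_{i,q}$. The function names, the weights `alpha`, and the example cluster are hypothetical.

```python
def delta_mse_star(n, x_i, a_i, A, k, weight):
    """Expected drop in the weighted cluster MSE for category i (assumed form)."""
    def est(n_, x_):
        return (x_ + a_i + 1) / (n_ + A + k)

    def mse(n_, x_):
        t = est(n_, x_)
        num = n_ * t * (1 - t) + (a_i + 1 - t * (A + k))**2
        return weight**2 * num / (n_ + A + k)**2

    t = est(n, x_i)
    return mse(n, x_i) - (1 - t) * mse(n + 1, x_i) - t * mse(n + 1, x_i + 1)

def weighted_criterion(x, a, weight, alpha):
    """sum_i alpha_i * DeltaMSE*(theta_hat_{i,q}) for one cluster."""
    n, A, k = sum(x), sum(a), len(x)
    return sum(al * delta_mse_star(n, xi, ai, A, k, weight)
               for al, xi, ai in zip(alpha, x, a))

def max_criterion(x, a, weight):
    """Largest DeltaMSE*(theta_hat_{i,q}) over the categories of one cluster."""
    n, A, k = sum(x), sum(a), len(x)
    return max(delta_mse_star(n, xi, ai, A, k, weight) for xi, ai in zip(x, a))

# Hypothetical three-category cluster holding 30 percent of the segment.
x = (4, 1, 1)
a = (1.0, 0.0, 0.5)
print(weighted_criterion(x, a, 0.3, alpha=(0.5, 0.5, 0.0)))
print(max_criterion(x, a, 0.3))
```

Setting one $\alpha_i$ to 1 and the rest to 0 recovers the single-category criterion, and for k = 2 equal weights give the same value as either category alone.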
5. SUMMARY


A Bayesian technique for stratified proportion estimation is presented for the
multicategory case, and detailed equations are derived for the case of a
generalized beta prior distribution. Additionally, a technique of
sequentially sampling from the clusters to achieve minimum mean squared error
segment proportion estimates for the categories of interest is presented, and
some computational issues are identified.